Translation Events in Cross-language Information Retrieval: Lexical Ambiguity, Lexical Holes, Vocabulary Mismatch, and Correct Translations

Authors

  • ANNE R. DIEKEMA
  • Anne Roel Diekema
  • Barbara Kwasnik
  • Liz Liddy
  • Jeffrey Katzer
  • Arie Noordzij
  • Jiangping Chen
  • Ted Diamond
  • Wen Hsiao
  • Wessel Kraaij
  • Farhad Oroumchian
  • Miguel Ruiz
  • Arvind Srinivasan
Abstract

Cross-Language Information Retrieval (CLIR) systems enable users to formulate queries in their native language to retrieve documents in foreign languages. Because queries and documents in CLIR do not necessarily share the same language, translation is needed before matching can take place. This translation step tends to reduce the retrieval performance of CLIR compared with monolingual information retrieval. Query translation is the prevailing CLIR approach and the focus of this study. Translating queries is inherently difficult because there is no one-to-one mapping between a lexical item and its meaning, which creates lexical ambiguity. This and other translation problems result in translation errors that affect CLIR performance. To understand the events occurring in cross-language retrieval query translation and the relation of these events to retrieval performance, the study explored the following research questions: 1) What kinds of translation events affect cross-language retrieval? 2) In what way does the presence of certain translation events in query translation affect retrieval performance? The study followed a two-phase multi-method approach. In phase one, a taxonomy of translation events was created through content analysis of queries and their translations in combination with an examination of the literature. In the second and final phase, a subset of the test queries was coded using the taxonomy resulting from phase one. These queries were then used in information retrieval experimentation to assess the impact of the translation events on retrieval performance.
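
As a rough illustration of why query translation lacks a one-to-one mapping, the sketch below runs a toy dictionary lookup over a query and records what happened to each term. It is not the study's method or taxonomy: the dictionary entries, the function name `translate_query`, and the three event labels are all assumptions made up for this example.

```python
# Toy English-to-Dutch dictionary (hypothetical entries, for illustration only).
TOY_DICTIONARY = {
    "bank": ["oever", "bank"],  # river bank vs. financial institution
    "river": ["rivier"],
}


def translate_query(terms, dictionary):
    """Replace each source term with every candidate translation listed for it.

    Returns the translated term list and a per-term label describing the
    lookup outcome. The labels are illustrative, not the study's taxonomy.
    """
    translated = []
    events = {}
    for term in terms:
        candidates = dictionary.get(term, [])
        if not candidates:
            # No dictionary entry: keep the source term untranslated.
            events[term] = "no_translation"
            translated.append(term)
        elif len(candidates) > 1:
            # Several candidates: the lookup alone cannot pick the right sense.
            events[term] = "ambiguous"
            translated.extend(candidates)
        else:
            events[term] = "single_translation"
            translated.extend(candidates)
    return translated, events


query, events = translate_query(["river", "bank", "erosion"], TOY_DICTIONARY)
print(query)
print(events)
```

In this sketch the ambiguous term contributes both of its candidates to the translated query, so an unintended sense can match irrelevant documents, while the term without an entry passes through in the source language and is unlikely to match target-language documents.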

Similar resources

JASIS Forthcoming – Jiangping Chen: A Lexical Knowledge Base Approach for English-Chinese Cross Language Information Retrieval

This study proposes and explores a natural language processing (NLP) based strategy to address out-of-dictionary and vocabulary mismatch problems in query-translation-based English-Chinese Cross Language Information Retrieval (EC-CLIR). The strategy, named the LKB approach, is to construct a lexical knowledge base (LKB) and to use it for query translation. This paper describes the LKB constructi...

A Probabilistic Translation Method for Dictionary-based Cross-lingual Information Retrieval in Agglutinative Languages

Translation ambiguity, out-of-vocabulary words, and missing translations in bilingual dictionaries make dictionary-based Cross-language Information Retrieval (CLIR) a challenging task. Moreover, in agglutinative languages that do not have reliable stemmers, the absence of various lexical formations from bilingual dictionaries degrades CLIR performance. This paper aims to introduce a probabilistic tr...

A lexical knowledge base approach for English-Chinese cross-language information retrieval

... the LKB approach, is to construct a lexical knowledge base (LKB) and to use it for query translation. In this article, the author describes the LKB construction process, which customizes available translation resources based on the document collection of the EC-CLIR system. The evaluation shows that the LKB approach is very promising. It consistently increased the percentage of correct translat...

Equivalency and Non-equivalency of Lexical Items in English Translations of Nahj al-balagha

Lexical items play a key role in both language in general and translation in particular. Likewise, equivalence is a controversial concept discussed so widely in translation studies. Some theorists deem it to be fundamental in translation theory and define translation in terms of equivalence. The aim of this study is to identify the problems of lexical gaps in two translations of Nahj al-ba...

Web-Based Query Translation for English-Chinese CLIR

Dictionary-based translation is a traditional approach in use by cross-language information retrieval systems. However, significant performance degradation is often observed when queries contain words that do not appear in the dictionary. This is called the Out of Vocabulary (OOV) problem. In recent years, Web mining has been shown to be one of the effective approaches for solving this problem....

Publication date: 2003